fix(ai): fix unsubstituted template vars + phantom tool in prompts (D1) #1259

Merged
ocervell merged 1 commit into ai-resiliency from fix/prompt-template-drift on Jul 1, 2026
@ocervell ocervell commented Jul 1, 2026

Contributor

Finding D1 — Prompt template/tool drift (P4 Pertinence).

Problem

  1. Unsubstituted template vars. constraints/queries.txt (included by every mode) references $query_types and $output_types_reference, but get_system_prompt substituted output_types_reference only in chat mode and never substituted query_types. Result: the rendered system prompt leaked a literal $query_types (all 3 modes) and a literal $output_types_reference (attack + exploit) to the LLM.
  2. Phantom tool. The <correct> example in constraints/common.txt taught run_query(...), a tool that does not exist. The real tool is query_workspace (TOOL_ACTION_MAP["query_workspace"] == "query"). The example JSON was also malformed (unbalanced braces, wrong _type shape).
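
The leak in item 1 is consistent with how Python's `string.Template.safe_substitute` behaves: a placeholder with no matching key is left in the output instead of raising. A minimal sketch, assuming the prompt files are rendered through `string.Template`; the template text and values below are invented for illustration and are not the real contents of queries.txt.

```python
from string import Template

# Illustrative stand-in for constraints/queries.txt (not the real file content).
template = Template(
    "Valid _type values: $query_types\n"
    "Reference:\n$output_types_reference"
)

# Behaviour described for chat mode before the fix: only one key is supplied,
# so safe_substitute silently leaves "$query_types" in the rendered text.
partial = template.safe_substitute(output_types_reference="(reference text)")

# Behaviour after the fix: both keys are supplied, nothing is left over.
full = template.safe_substitute(
    query_types="vulnerability, port",
    output_types_reference="(reference text)",
)

print(partial)
print(full)
```

Using `substitute` instead of `safe_substitute` would raise `KeyError` on the missing key, which surfaces drift early but also rejects any other `$name` text in the prompt that is not meant as a placeholder.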

Fix

  • secator/ai/prompts.py get_system_prompt: build one substitution dict with query_types=build_query_types() and output_types_reference=build_output_types_reference() for all modes (library_reference/path_vars still only for attack/exploit). Values derive from FINDING_TYPES, so no hardcoded list to drift. Uses existing safe_substitute.
  • secator/ai/prompts/constraints/common.txt:15: run_query({...}) -> query_workspace(query={"_type": "vulnerability", "severity": {"$in": ["high", "critical"]}}), matching the query_workspace examples already in queries.txt.
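
The first bullet can be sketched roughly as follows. This is not the code from secator/ai/prompts.py: the `render` helper, the builder bodies, and the placeholder values for library_reference and path_vars are all hypothetical, and only the overall shape (one dict, shared keys for every mode, extra keys for attack/exploit, `safe_substitute`) comes from the description above.

```python
from string import Template

# Hypothetical stand-ins for the real builders, which derive from FINDING_TYPES.
def build_query_types() -> str:
    return "vulnerability, port, url"

def build_output_types_reference() -> str:
    return "vulnerability: name, severity\nport: ip, port"

def render(template_text: str, mode: str) -> str:
    """Render a prompt template with one substitution dict (hypothetical helper)."""
    # Keys that every mode needs, because queries.txt is included everywhere.
    values = {
        "query_types": build_query_types(),
        "output_types_reference": build_output_types_reference(),
    }
    # Extra keys only for attack/exploit; how the real values are built
    # is not shown in the PR, so these strings are placeholders.
    if mode in ("attack", "exploit"):
        values["library_reference"] = "(library reference)"
        values["path_vars"] = "(path variables)"
    # safe_substitute leaves unrelated "$..." text, such as the "$in"
    # operator in the query example, untouched.
    return Template(template_text).safe_substitute(values)

TEMPLATE = (
    "Types: $query_types\n$output_types_reference\n"
    'Example: query_workspace(query={"severity": {"$in": ["high"]}})'
)
rendered = {mode: render(TEMPLATE, mode) for mode in ("chat", "attack", "exploit")}
```

One reason `safe_substitute` fits here is that the corrected example contains `$in`, which `substitute` would treat as a missing placeholder and reject.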

Value sources

  • $query_types <- build_query_types() (comma-joined cls.get_name() over FINDING_TYPES).
  • $output_types_reference <- build_output_types_reference() (same registry).
  • Real tool name confirmed against secator/ai/tools.py TOOL_ACTION_MAP (read-only).
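
To illustrate why deriving the value from the registry prevents drift, here is a small sketch. The classes and the `", "` separator are assumptions made for the example; only the idea of joining `cls.get_name()` over `FINDING_TYPES` is taken from the list above.

```python
class Finding:
    """Hypothetical base class; the real secator output types differ."""

    @classmethod
    def get_name(cls) -> str:
        # Assumed naming rule for this sketch: lowercase class name.
        return cls.__name__.lower()

class Vulnerability(Finding):
    pass

class Port(Finding):
    pass

# Hypothetical registry standing in for secator's FINDING_TYPES.
FINDING_TYPES = [Vulnerability, Port]

def build_query_types() -> str:
    # Comma-joined names, computed on each call so the result follows the registry.
    return ", ".join(cls.get_name() for cls in FINDING_TYPES)

before = build_query_types()

# Registering a new type changes the rendered value with no prompt edit needed.
class Url(Finding):
    pass

FINDING_TYPES.append(Url)
after = build_query_types()
```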

Tests

tests/unit/test_ai_prompts.py: 38 -> 41 passed. Added 3 regression tests asserting every mode renders with no $query_types/$output_types_reference, that query_types renders to real registry names, and that prompts reference query_workspace not run_query.
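
The three regression checks could look something like the sketch below. It is not the code added to tests/unit/test_ai_prompts.py: `find_prompt_drift` is a hypothetical helper and the two sample prompts are invented, so treat it as an outline of the assertions rather than the actual tests.

```python
import re

PLACEHOLDERS = ("$query_types", "$output_types_reference")

def find_prompt_drift(prompt: str) -> list[str]:
    """Return a list of drift problems found in a rendered prompt (hypothetical helper)."""
    problems = [f"unsubstituted {p}" for p in PLACEHOLDERS if p in prompt]
    # The phantom tool must not appear anywhere in the rendered text.
    if re.search(r"\brun_query\b", prompt):
        problems.append("phantom tool run_query")
    # The real tool name is expected in the examples.
    if "query_workspace" not in prompt:
        problems.append("missing query_workspace")
    return problems

# Invented samples resembling the before/after states described in this PR.
before_prompt = "Use run_query({...}). Types: $query_types"
after_prompt = (
    'Use query_workspace(query={"_type": "vulnerability"}). '
    "Types: vulnerability, port"
)

before_problems = find_prompt_drift(before_prompt)
after_problems = find_prompt_drift(after_prompt)
```

In a real test the helper would be applied to the output of get_system_prompt for each mode, with an assertion that the returned list is empty.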

Rendered-prompt check before/after:

| mode | $query_types | $output_types_reference | run_query |
|---|---|---|---|
| before (all) | leaked | leaked (attack/exploit) | present |
| after (all) | gone | gone | gone (query_workspace) |

🤖 Generated with Claude Code

…xample (D1)

The queries.txt constraint (included by every mode) references $query_types
and $output_types_reference, but get_system_prompt only substituted
output_types_reference in chat mode and query_types nowhere — so the rendered
system prompt leaked literal $query_types / $output_types_reference to the LLM.

Substitute both for all modes, derived from FINDING_TYPES (build_query_types /
build_output_types_reference) so they can't drift from the registry.

Also fix the phantom run_query tool in common.txt's <correct> example: the real
tool is query_workspace (TOOL_ACTION_MAP query_workspace -> query), and the
example JSON was malformed. Corrected to a valid query_workspace call.

Adds regression tests asserting rendered prompts for every mode contain no
unsubstituted $query_types/$output_types_reference and reference query_workspace
rather than run_query.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
coderabbitai Bot commented Jul 1, 2026

Contributor

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2897d169-6e26-41d4-89d9-1c57f332af2c

@ocervell ocervell merged commit 4b502f4 into ai-resiliency Jul 1, 2026
1 check passed
@ocervell ocervell deleted the fix/prompt-template-drift branch July 1, 2026 17:02